IN THE CLAIMS:

Please revise the claims as follows: 

1. (Previously presented) A method of converting a document corpus containing an ordered
plurality of documents into a compact representation in memory of occurrence data, said 
method comprising: 

developing a first vector for said entire document corpus, said first vector being a 
listing of integers corresponding to terms in said documents such that each said document in 
said document corpus is sequentially represented in said listing. 

2. (Previously presented) The method of claim 18, further comprising: 

developing a third vector for said entire document corpus, said third vector 
comprising a sequential listing of floating point multipliers, each said floating point 
multiplier representing a document normalization factor. 

3. (Previously presented) The method of claim 18, further comprising: 

rearranging, in said first vector, an order of said unique integers within the data for 
each said document so that all identical unique integers are adjacent. 

4. (Original) The method of claim 2, wherein said normalization factor is calculated as: 

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all 
term occurrences in said document. 

5. (Previously presented) A method of converting, organizing, and representing in a
computer memory a document corpus containing an ordered plurality of documents, said 
method comprising: 

for said document corpus, taking in sequence each said ordered document and 
developing a first uninterrupted listing of integers to correspond to an occurrence of terms in 
the document corpus. 

6. (Previously presented) The method of claim 19, further comprising: 

developing a third uninterrupted listing for said entire document corpus, said third 
uninterrupted listing containing a sequential listing of floating point multipliers, each said 
floating point multiplier representing a document normalization factor for a corresponding 
document in said document corpus. 

7. (Previously presented) The method of claim 19, further comprising: 

for each said document in said document corpus, rearranging said unique integers so 
that any identical integers are adjacent. 

8. (Original) The method of claim 6, wherein said normalization factor is calculated as: 

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all 
term occurrences in said document. 

9. (Currently amended) An apparatus for organizing and representing in a computer memory
a document corpus containing an ordered plurality of documents, said apparatus comprising: 

an integer determining module receiving in sequence each said ordered document of 
said document corpus and developing a first uninterrupted listing of said unique integers to 
correspond to an occurrence of said dictionary terms in the document corpus. 

10. (Original) The apparatus of claim 9, further comprising: 

a normalizer developing a third uninterrupted listing for said entire document corpus, 
containing a sequential listing of floating point multipliers, each said floating point multiplier 
representing a document normalization factor for a corresponding document in said 
document corpus. 

11. (Original) The apparatus of claim 9, further comprising:

a rearranger rearranging said unique integers so that any identical integers for each 
said document in said document corpus are adjacent. 

12. (Original) The apparatus of claim 10, wherein said normalizer calculates said 
normalization factor as: 

$NF = 1/\left(\sum_i x_i^2\right)^{1/2}$, where $x_i$ is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all 
term occurrences in said document. 

13. (Currently amended) A signal-bearing medium tangibly embodying a program of
machine-readable instructions executable by a digital processing apparatus to perform a 
method to organize and represent in a computer memory a document corpus containing an 
ordered plurality of documents, said method comprising: 

developing a first uninterrupted listing of said unique integers to correspond to the 
occurrence of said dictionary terms in the document corpus.

14. (Previously presented) The signal-bearing medium of claim 25, wherein said method 
further comprises: 

developing a third uninterrupted listing for said entire document corpus, containing a 
sequential listing of floating point multipliers, each said floating point multiplier representing 
a document normalization factor for a corresponding document in said document corpus. 

15. (Currently amended) A data converter for organizing and representing in a computer 
memory a document corpus containing an ordered plurality of documents, for use by a data 
mining applications program requiring occurrence-of-terms data, said representation to be 
based on terms in a dictionary previously developed for said document corpus and wherein 
each said term in said dictionary has associated therewith a corresponding unique integer, 
said data converter comprising: 

means for developing a first uninterrupted listing of said unique integers to 
correspond to the occurrence of said dictionary terms in the document corpus; and

means for developing a second uninterrupted listing for said entire document corpus 
containing in sequence the location of each corresponding document in said first 
uninterrupted listing, wherein said first listing and said second listing are provided as input
data for said data mining applications program. 
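
Illustrative sketch only, not part of the claims: one way the first and second uninterrupted listings recited above might be laid out in memory. The function name `encode_corpus`, the pre-tokenized input, the skipping of terms absent from the dictionary, and the trailing end offset in the second listing are all assumptions made for this example.

```python
from typing import Dict, List, Sequence, Tuple


def encode_corpus(
    documents: Sequence[Sequence[str]],
    dictionary: Dict[str, int],
) -> Tuple[List[int], List[int]]:
    """Return (first_listing, second_listing) for an ordered, tokenized corpus."""
    first_listing: List[int] = []   # term integers for all documents, back to back
    second_listing: List[int] = []  # start position of each document in first_listing
    for tokens in documents:
        second_listing.append(len(first_listing))
        for term in tokens:
            term_id = dictionary.get(term)
            # Assumption: terms missing from the dictionary are simply skipped.
            if term_id is not None:
                first_listing.append(term_id)
    # Assumption: a final end offset is appended so that document d occupies
    # first_listing[second_listing[d]:second_listing[d + 1]].
    second_listing.append(len(first_listing))
    return first_listing, second_listing


# Hypothetical three-document corpus and previously developed dictionary.
dictionary = {"cat": 0, "dog": 1, "fish": 2}
documents = [["cat", "dog", "cat"], ["fish"], ["dog", "bird", "dog"]]
first_listing, second_listing = encode_corpus(documents, dictionary)
```

With this layout the whole corpus is held in two flat integer arrays, and any document can be recovered by slicing the first listing between two adjacent entries of the second.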

16. (Original) The data converter of claim 15, further comprising: 

means for developing a third uninterrupted listing for said entire document corpus, 
containing a sequential listing of floating point multipliers, each said floating point multiplier 
representing a document normalization factor for a corresponding document in said 
document corpus. 
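
Illustrative sketch only, not part of the claims: a third listing of per-document normalization factors could be derived from the first two listings, using the reciprocal of the square root of the sum of squared term counts recited elsewhere in these claims. The function name, the trailing end offset in the second listing, and the value 0.0 for an empty document are assumptions of this example.

```python
import math
from collections import Counter
from typing import List, Sequence


def normalization_factors(
    first_listing: Sequence[int],
    second_listing: Sequence[int],
) -> List[float]:
    """Return one floating point multiplier per document.

    Assumes second_listing holds each document's start offset followed by
    one final end offset.
    """
    third_listing: List[float] = []
    for start, end in zip(second_listing, second_listing[1:]):
        # Count occurrences of each term integer within this document.
        counts = Counter(first_listing[start:end])
        sum_of_squares = sum(c * c for c in counts.values())
        # Assumption: an empty document gets a factor of 0.0 to avoid dividing by zero.
        factor = 1.0 / math.sqrt(sum_of_squares) if sum_of_squares else 0.0
        third_listing.append(factor)
    return third_listing


# Hypothetical listings for three documents: [0, 1, 0], [2], and [1, 1].
third_listing = normalization_factors([0, 1, 0, 2, 1, 1], [0, 3, 4, 6])
```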

17. (Original) The data converter of claim 15, further comprising: 

means for rearranging said unique integers so that any identical integers for each said 
document in said document corpus are adjacent. 
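
Illustrative sketch only, not part of the claims: sorting each document's segment of the first listing is one simple way to make identical integers adjacent. The claims do not require sorting specifically; the function name and the trailing end offset in the second listing are assumptions of this example.

```python
from typing import List, Sequence


def group_identical_integers(
    first_listing: Sequence[int],
    second_listing: Sequence[int],
) -> List[int]:
    """Return a copy of first_listing with each document's integers grouped.

    Assumes second_listing holds each document's start offset followed by
    one final end offset. Document lengths are unchanged, so the same
    second_listing remains valid for the returned listing.
    """
    rearranged: List[int] = []
    for start, end in zip(second_listing, second_listing[1:]):
        # Sorting within the segment places equal integers side by side.
        rearranged.extend(sorted(first_listing[start:end]))
    return rearranged


# Hypothetical listings for three documents: [0, 1, 0], [2], and [1, 1].
rearranged = group_identical_integers([0, 1, 0, 2, 1, 1], [0, 3, 4, 6])
```

Once identical integers are adjacent, the occurrence count of a term in a document is just the length of its run in that document's segment.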

18. (Previously presented) The method of claim 1, further comprising: 

developing a dictionary comprising said terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers being said integers comprising said
first vector.

19. (Previously presented) The method of claim 1, further comprising: 

developing a second vector for said entire document corpus, said second vector 
indicating the location of each said document's representation in said first vector. 

20. (Previously presented) The method of claim 5, further comprising:

developing a dictionary comprising terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers used in said first uninterrupted
listing.

21. (Previously presented) The method of claim 5, further comprising: 

developing a second uninterrupted listing for said entire document corpus, said 
second uninterrupted listing containing, in sequence, the location of each corresponding 
document in said first uninterrupted listing. 

22. (Previously presented) The apparatus of claim 9, further comprising: 

a dictionary developing module to develop a dictionary of terms contained in said 
document corpus, each said term being associated with a corresponding unique integer. 

23. (Previously presented) The apparatus of claim 9, further comprising: 

a locator module developing a second uninterrupted listing for said entire document 
corpus, said second uninterrupted listing containing, in sequence, the location of each 
corresponding document in said first uninterrupted listing. 

24. (Previously presented) The signal-bearing medium of claim 13, wherein said method
further comprises: 

developing a dictionary comprising terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers used in said first uninterrupted
listing.

25. (Previously presented) The signal-bearing medium of claim 13, wherein said method 
further comprises: 

developing a second uninterrupted listing for said entire document corpus, said 
second uninterrupted listing containing, in sequence, the location of each corresponding 
document in said first uninterrupted listing. 